Developing Serverless Solutions on AWS
| Monitoring (was) | Observability (is) |
|---|---|
| Watching layers of your stack | Visibility into the system as a whole |
| Identifying failures via probing | Understanding if system behaves as expected |
| Focusing on metrics | Gaining insight: usage, experience, trends, root cause |
| Reactive (alert when broken) | Proactive (understand before it breaks) |
Distributed serverless applications make observability even more critical: there are no servers to SSH into, and each request passes through many small components.
# Python - structured JSON logging (recommended!)
import json
message = {
    "level": "INFO",
    "timestamp": "2024-12-11T12:44:40.300Z",
    "requestId": "abc-123",
    "orderId": "ORD-456",
    "action": "AddToCart",
    "quantity": 2,
    "productId": "a23390f3",
    "environment": "prod"
}
print(json.dumps(message))
# CloudWatch Logs Insights can then query:
# fields @timestamp, orderId, action
# | filter level = "ERROR"
# | sort @timestamp desc
# | limit 20
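Building the dictionary by hand for every log line gets repetitive. One possible way to get the same one-JSON-object-per-line output is a custom formatter for the standard `logging` module. The sketch below is illustrative only: `JsonFormatter`, `build_logger`, and the `context` key passed through `extra` are hypothetical names, not part of any AWS library, and the Lambda runtime's own logging configuration may need adjusting to match.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        entry = {
            "level": record.levelname,
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "message": record.getMessage(),
        }
        # Fields passed as extra={"context": {...}} are merged into the entry
        entry.update(getattr(record, "context", {}))
        return json.dumps(entry)


def build_logger(name="orders"):
    """Return a logger that writes JSON lines to the default stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid duplicate handlers on warm starts
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.propagate = False  # keep the root logger from printing a second copy
    return logger


logger = build_logger()
logger.info("AddToCart", extra={"context": {"orderId": "ORD-456", "quantity": 2}})
```

Because every field ends up as a top-level JSON key, the Logs Insights query shown above can filter on `level` or `orderId` without any parsing step.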
# Python - instrument your Lambda with X-Ray
from aws_xray_sdk.core import xray_recorder
from aws_xray_sdk.core import patch_all
# Patch all supported libraries (boto3, requests, ...) so their calls are traced automatically
patch_all()

def handler(event, context):
    # All boto3 calls are now traced automatically!
    # Add custom subsegments for your own code:
    subsegment = xray_recorder.begin_subsegment('process_order')
    try:
        # process_order is your own business logic, defined elsewhere
        result = process_order(event)
        subsegment.put_annotation('orderId', event['orderId'])
        subsegment.put_metadata('result', result)
    finally:
        xray_recorder.end_subsegment()
    return result
| Metric | What It Tells You | Alarm On |
|---|---|---|
| Errors | Failed invocations | Any increase above 0 |
| Duration | How long function runs | Approaching timeout |
| Throttles | Concurrency limit hit | Any occurrence |
| ConcurrentExecutions | Active instances | Near account limit |
| IteratorAge | Stream processing lag | Growing over time |
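As a rough sketch of the first row of the table, the snippet below builds the arguments for a CloudWatch alarm that fires when a function's `Errors` metric rises above 0. The helper names, the 60-second period, and the choice to treat missing data as not breaching are assumptions made for illustration; tune them to your own workload.

```python
def errors_alarm_params(function_name, topic_arn):
    """Build put_metric_alarm arguments for an 'Errors > 0' alarm."""
    return {
        "AlarmName": f"{function_name}-errors",
        "Namespace": "AWS/Lambda",
        "MetricName": "Errors",
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "Statistic": "Sum",
        "Period": 60,               # assumed: evaluate one-minute windows
        "EvaluationPeriods": 1,
        "Threshold": 0,
        "ComparisonOperator": "GreaterThanThreshold",
        # Assumed: no invocations means no data, which should not raise the alarm
        "TreatMissingData": "notBreaching",
        "AlarmActions": [topic_arn],
    }


def create_errors_alarm(function_name, topic_arn):
    """Create the alarm; requires boto3 and AWS credentials."""
    import boto3  # imported here so the helper above works without AWS access

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_alarm(**errors_alarm_params(function_name, topic_arn))
```

Calling `create_errors_alarm("observability-demo", topic_arn)` with the ARN of an existing SNS topic would create the alarm. The same pattern should carry over to `Throttles` or `Duration` by changing the metric name, statistic, and threshold.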
# Embedded metric format (EMF) - generate metrics from logs!
# No PutMetricData API call needed - just print structured JSON
import json
from datetime import datetime
metric_log = {
    "_aws": {
        "Timestamp": int(datetime.now().timestamp() * 1000),
        "CloudWatchMetrics": [{
            "Namespace": "MyApp/Orders",
            "Dimensions": [["Environment", "Region"]],
            "Metrics": [
                {"Name": "OrderCount", "Unit": "Count"},
                {"Name": "OrderValue", "Unit": "None"}
            ]
        }]
    },
    "Environment": "prod",
    "Region": "us-west-2",
    "OrderCount": 1,
    "OrderValue": 149.99
}
print(json.dumps(metric_log))
# CloudWatch automatically extracts OrderCount and OrderValue as metrics!
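The EMF document above has a regular shape: metric definitions and dimension names go under `_aws`, and the matching values sit at the top level. A small helper can build it from plain dictionaries. This is a minimal sketch rather than an official client; `emf_record` and its parameters are hypothetical names chosen for this example.

```python
import json
import time


def emf_record(namespace, dimensions, metrics, timestamp_ms=None):
    """Build an EMF log record.

    dimensions: {dimension_name: value}
    metrics:    {metric_name: (value, unit)}
    """
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    record = {
        "_aws": {
            "Timestamp": timestamp_ms,
            "CloudWatchMetrics": [{
                "Namespace": namespace,
                # One dimension set containing every dimension name
                "Dimensions": [list(dimensions)],
                "Metrics": [
                    {"Name": name, "Unit": unit}
                    for name, (_, unit) in metrics.items()
                ],
            }],
        },
    }
    # Dimension and metric values live at the top level of the record
    record.update(dimensions)
    record.update({name: value for name, (value, _) in metrics.items()})
    return record


print(json.dumps(emf_record(
    "MyApp/Orders",
    {"Environment": "prod", "Region": "us-west-2"},
    {"OrderCount": (1, "Count"), "OrderValue": (149.99, "None")},
)))
```

Keeping the builder separate from the `print` call makes it easy to check the structure in a unit test before anything is sent to CloudWatch.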
| Metric | Why It Matters |
|---|---|
| memory_utilization | Are you over/under-provisioned? |
| init_duration | How bad are cold starts? |
| cpu_total_time | Is function CPU-bound? |
| rx_bytes / tx_bytes | Network I/O bottlenecks |
The X-Ray trace map in the CloudWatch console brings X-Ray traces together with CloudWatch logs, metrics, and alarms in a single interface.
Deploy a Lambda function behind API Gateway with logging, tracing, and metrics enabled, then walk through the logs, traces, and metrics it produces.
# observability_demo.py
import json, os, time, boto3
from aws_xray_sdk.core import xray_recorder, patch_all

patch_all()  # Auto-trace calls made through supported libraries, including boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table(os.environ.get('TABLE_NAME', 'demo-orders'))

def handler(event, context):
    start = time.time()
    # API Gateway sends queryStringParameters as null when there is no query string
    params = event.get('queryStringParameters') or {}
    order_id = params.get('id', 'ORD-001')
    # Structured log (JSON)
    print(json.dumps({"level": "INFO", "action": "GetOrder", "orderId": order_id}))
    # Custom X-Ray subsegment
    with xray_recorder.in_subsegment('fetch_order') as subseg:
        subseg.put_annotation('orderId', order_id)
        item = table.get_item(Key={'orderId': order_id})
    duration = (time.time() - start) * 1000
    print(json.dumps({"level": "INFO", "action": "Complete", "duration_ms": round(duration)}))
    # default=str: DynamoDB returns numbers as Decimal, which json cannot serialize on its own
    return {"statusCode": 200, "body": json.dumps(item.get('Item', {}), default=str)}
# Create function with X-Ray enabled
# Replace ACCOUNT with your account ID; the role needs CloudWatch Logs, X-Ray, and DynamoDB read permissions
zip observability_demo.zip observability_demo.py
aws lambda create-function --function-name observability-demo \
  --runtime python3.12 --handler observability_demo.handler \
  --role arn:aws:iam::ACCOUNT:role/lambda-xray-role \
  --zip-file fileb://observability_demo.zip \
  --environment Variables='{TABLE_NAME=demo-orders}' \
  --tracing-config Mode=Active \
  --logging-config LogFormat=JSON,ApplicationLogLevel=INFO
# Create DynamoDB table
aws dynamodb create-table --table-name demo-orders \
  --attribute-definitions AttributeName=orderId,AttributeType=S \
  --key-schema AttributeName=orderId,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST
# Put test item
aws dynamodb put-item --table-name demo-orders \
  --item '{"orderId":{"S":"ORD-001"},"item":{"S":"Laptop"},"amount":{"N":"999"}}'
# Create HTTP API + invoke
aws apigatewayv2 create-api --name observability-demo-api --protocol-type HTTP
# (attach integration + route, then test)
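# Invoke directly to generate traces
# (AWS CLI v2 needs --cli-binary-format raw-in-base64-out to accept a raw JSON payload)
aws lambda invoke --function-name observability-demo \
  --cli-binary-format raw-in-base64-out \
  --payload '{"queryStringParameters":{"id":"ORD-001"}}' output.json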
| Tool | What to Demonstrate |
|---|---|
| CloudWatch Logs | Show JSON structured logs, filter by orderId, run Logs Insights query |
| X-Ray Traces | Open the trace map, drill into a trace, show Lambda + DynamoDB segments with latency |
| CloudWatch Metrics | Show Lambda Duration, Invocations, Errors graphs |
| Lambda Insights | Enable enhanced monitoring, show memory/CPU utilization |
| Alarms | Create alarm on Errors > 0, show SNS notification |
fields @timestamp, action, orderId, duration_ms
| filter level = "INFO"
| sort @timestamp desc
| limit 20
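The query above can also be run from code instead of the console. The sketch below uses the CloudWatch Logs `start_query` and `get_query_results` calls and polls until the query finishes. `run_insights_query` and its parameters are hypothetical names for this example; the client is passed in so the function can be exercised with a stub, and the one-hour lookback is an assumption.

```python
import time


def run_insights_query(logs_client, log_group, query,
                       lookback_seconds=3600, poll_seconds=1.0, max_polls=30):
    """Run a Logs Insights query and return each result row as a dict."""
    end = int(time.time())
    start = end - lookback_seconds
    query_id = logs_client.start_query(
        logGroupName=log_group,
        startTime=start,
        endTime=end,
        queryString=query,
    )["queryId"]

    # Queries run asynchronously, so poll until a terminal status is reported
    for _ in range(max_polls):
        response = logs_client.get_query_results(queryId=query_id)
        if response["status"] in ("Complete", "Failed", "Cancelled", "Timeout"):
            break
        time.sleep(poll_seconds)
    else:
        raise TimeoutError(f"query {query_id} did not finish in time")

    if response["status"] != "Complete":
        raise RuntimeError(f"query {query_id} ended with status {response['status']}")

    # Each row is a list of {"field": ..., "value": ...} pairs
    return [{cell["field"]: cell["value"] for cell in row}
            for row in response["results"]]


# Example call (requires boto3 and AWS credentials):
#   import boto3
#   rows = run_insights_query(
#       boto3.client("logs"),
#       "/aws/lambda/observability-demo",
#       'fields @timestamp, action, orderId, duration_ms | filter level = "INFO" | limit 20',
#   )
```

The log group name in the example comment follows Lambda's default `/aws/lambda/<function-name>` pattern; adjust it if the function was configured with a custom log group.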